1.
Nucleic Acids Res ; 52(D1): D938-D949, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38000386

ABSTRACT

Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.


Subjects
Databases, Factual; Disease; Genes; Phenotype; Humans; Internet; Databases, Factual/standards; Software; Genes/genetics; Disease/genetics
2.
Bioinformatics ; 39(7), 2023 07 01.
Article in English | MEDLINE | ID: mdl-37389415

ABSTRACT

MOTIVATION: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking. RESULTS: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification. AVAILABILITY AND IMPLEMENTATION: https://kghub.org.
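
The extract-transform-load pattern described in this abstract can be illustrated with a minimal sketch. It is not KG-Hub's own tooling: the input rows, the `EX:` identifiers, and the function names are hypothetical, and the generic `biolink:related_to` predicate is used only as a placeholder for whatever relation a real ingest would assert.

```python
import csv
import io

# Hypothetical upstream data; a real ingest would read a cached download.
RAW = """gene_id,gene_symbol,disease_id,disease_name
EX:G1,GENE1,EX:D1,disease one
EX:G2,GENE2,EX:D1,disease one
"""


def extract(text):
    """Extract: parse the upstream file into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: emit de-duplicated nodes and edges with Biolink-style CURIEs."""
    nodes, edges = {}, []
    for r in rows:
        nodes[r["gene_id"]] = {
            "id": r["gene_id"], "category": "biolink:Gene", "name": r["gene_symbol"]}
        nodes[r["disease_id"]] = {
            "id": r["disease_id"], "category": "biolink:Disease", "name": r["disease_name"]}
        edges.append({
            "subject": r["gene_id"],
            "predicate": "biolink:related_to",  # placeholder predicate
            "object": r["disease_id"]})
    return list(nodes.values()), edges


def load(records, fieldnames):
    """Load: serialize records as tab-separated text (a real build would write files)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


nodes, edges = transform(extract(RAW))
nodes_tsv = load(nodes, ["id", "category", "name"])
edges_tsv = load(edges, ["subject", "predicate", "object"])
```

Keeping the three stages as separate functions is what makes a transformed subgraph reusable: another project could call `transform` on the same cached rows and merge the resulting node and edge tables into its own graph.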


Subjects
Biological Ontologies; COVID-19; Humans; Pattern Recognition, Automated; Rare Diseases; Machine Learning
3.
Clin Transl Sci ; 15(8): 1848-1855, 2022 08.
Article in English | MEDLINE | ID: mdl-36125173

ABSTRACT

Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs has left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
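
The idea of hierarchical classes plus predicates that constrain how entities relate can be sketched as follows. This is a toy model, not the Biolink Model itself: the hierarchy is heavily abbreviated, and the domain and range constraints in `PREDICATES` are assumptions made for illustration.

```python
# Toy class hierarchy (child -> parent). Names follow Biolink's naming style,
# but the real model has many more classes and intermediate levels.
PARENT = {
    "Gene": "BiologicalEntity",
    "Disease": "BiologicalEntity",
    "PhenotypicFeature": "BiologicalEntity",
    "BiologicalEntity": "NamedThing",
    "NamedThing": None,
}

# Hypothetical constraints: predicate -> (domain class, range class).
PREDICATES = {
    "related_to": ("NamedThing", "NamedThing"),
    "has_phenotype": ("BiologicalEntity", "PhenotypicFeature"),
}


def ancestors(cls):
    """Return cls followed by all of its ancestors, nearest first."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PARENT[cls]
    return chain


def is_a(cls, target):
    """True if cls equals target or inherits from it."""
    return target in ancestors(cls)


def edge_is_valid(subject_cls, predicate, object_cls):
    """Check an edge against the predicate's domain and range."""
    if predicate not in PREDICATES:
        return False
    domain, range_ = PREDICATES[predicate]
    return is_a(subject_cls, domain) and is_a(object_cls, range_)
```

Under these assumptions, `edge_is_valid("Disease", "has_phenotype", "PhenotypicFeature")` holds, while the same predicate pointing at a `Gene` is rejected, which is the kind of check a shared data model lets downstream consumers automate.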


Subjects
Pattern Recognition, Automated; Translational Science, Biomedical; Knowledge
4.
Database (Oxford) ; 2022, 2022 05 25.
Article in English | MEDLINE | ID: mdl-35616100

ABSTRACT

Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Or are they associated in some other way? Such relationships between the mapped terms are often not documented, which leads to incorrect assumptions and makes them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Furthermore, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. We have developed the Simple Standard for Sharing Ontological Mappings (SSSOM) which addresses these problems by: (i) Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. (ii) Defining an easy-to-use simple table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data principles. (iii) Implementing open and community-driven collaborative workflows that are designed to evolve the standard continuously to address changing requirements and mapping practices. (iv) Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases in detail and survey some of the existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable and Reusable (FAIR). The SSSOM specification can be found at http://w3id.org/sssom/spec. 
Database URL: http://w3id.org/sssom/spec.
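
Because SSSOM is a simple table-based format, a mapping set can be handled with ordinary tabular tooling. The sketch below assumes a tab-separated file with `#`-prefixed metadata lines and the columns `subject_id`, `predicate_id`, `object_id`, `mapping_justification`, and `confidence`; the `EXA:` and `EXB:` identifiers and the rows themselves are made up for illustration, and the specification linked above remains the authority on the actual column set.

```python
import csv

# Hypothetical mapping set; the rows are invented for this example.
SSSOM_TSV = (
    "# curie_map:\n"
    "#   EXA: http://example.org/a/\n"
    "#   EXB: http://example.org/b/\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\tconfidence\n"
    "EXA:1\tskos:exactMatch\tEXB:10\tsemapv:ManualMappingCuration\t0.99\n"
    "EXA:2\tskos:broadMatch\tEXB:20\tsemapv:LexicalMatching\t0.80\n"
    "EXA:3\tskos:exactMatch\tEXB:30\tsemapv:LexicalMatching\t0.60\n"
)


def read_mappings(text):
    """Parse the table, skipping '#'-prefixed metadata lines."""
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    return list(csv.DictReader(lines, delimiter="\t"))


def precise_matches(mappings, min_confidence=0.9):
    """Keep only exact matches at or above the confidence threshold."""
    return [
        m for m in mappings
        if m["predicate_id"] == "skos:exactMatch"
        and float(m["confidence"]) >= min_confidence
    ]


mappings = read_mappings(SSSOM_TSV)
precise = [(m["subject_id"], m["object_id"]) for m in precise_matches(mappings)]
```

Filtering on the predicate and the confidence together is the sort of precision-sensitive use the abstract has in mind: a broad match or a low-confidence lexical match is kept out of the result instead of being silently treated as equivalence.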


Subjects
Metadata; Semantic Web; Data Management; Databases, Factual; Workflow
5.
Genet Med ; 24(7): 1512-1522, 2022 07.
Article in English | MEDLINE | ID: mdl-35442193

ABSTRACT

PURPOSE: Genomic test results, regardless of laboratory variant classification, require clinical practitioners to judge the applicability of a variant for medical decisions. Teaching and standardizing clinical interpretation of genomic variation calls for a methodology or tool. METHODS: To generate such a tool, we distilled the Clinical Genome Resource framework of causality and the American College of Medical Genetics/Association for Molecular Pathology and Quest Diagnostic Laboratory scoring of variant deleteriousness into the Clinical Variant Analysis Tool (CVAT). Applying this to 289 clinical exome reports, we compared the performance of junior practitioners with that of experienced medical geneticists and assessed the utility of reported variants. RESULTS: CVAT enabled performance comparable to that of experienced medical geneticists. In total, 124 of 289 (42.9%) exome reports and 146 of 382 (38.2%) reported variants supported a diagnosis. Overall, 10.5% (1 pathogenic [P] or likely pathogenic [LP] variant and 39 variants of uncertain significance [VUS]) of variants were reported in genes without established disease association; 20.2% (23 P/LP and 54 VUS) were in genes without sufficient phenotypic concordance; 7.3% (15 P/LP and 13 VUS) conflicted with the known molecular disease mechanism; and 24% (91 VUS) had insufficient evidence for deleteriousness. CONCLUSION: Implementation of CVAT standardized clinical interpretation of genomic variation and emphasized the need for collaborative and transparent reporting of genomic variation.
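
The percentages in the RESULTS can be reproduced from the counts given in the abstract. The short check below uses only those counts; the category labels are paraphrases, and reading the five categories as a partition of the 382 reported variants is an inference from the arithmetic, not a statement made in the abstract.

```python
TOTAL_REPORTS = 289
TOTAL_VARIANTS = 382


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Variant counts as stated in the abstract (P/LP + VUS where both are given).
categories = {
    "supported a diagnosis": 146,
    "gene without established disease association": 1 + 39,
    "insufficient phenotypic concordance": 23 + 54,
    "conflict with known disease mechanism": 15 + 13,
    "insufficient evidence for deleteriousness": 91,
}

shares = {label: pct(n, TOTAL_VARIANTS) for label, n in categories.items()}
report_share = pct(124, TOTAL_REPORTS)
accounted_for = sum(categories.values())
```

The first four shares match the abstract to one decimal place (38.2, 10.5, 20.2, 7.3), the last comes out at 23.8 where the abstract reports a rounded 24%, and the five counts add up to exactly 382.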


Subjects
Genetic Testing; Genetic Variation; Exome; Genetic Testing/methods; Genetic Variation/genetics; Genomics/methods; Humans; Exome Sequencing
6.
J Med Internet Res ; 23(3): e21023, 2021 03 16.
Article in English | MEDLINE | ID: mdl-33724192

ABSTRACT

BACKGROUND: 16p13.11 microduplication syndrome has a variable presentation and is characterized primarily by neurodevelopmental and physical phenotypes resulting from copy number variation at chromosome 16p13.11. Given its variability, there may be features that have not yet been reported. The goal of this study was to use a patient "self-phenotyping" survey to collect data directly from patients to further characterize the phenotypes of 16p13.11 microduplication syndrome. OBJECTIVE: This study aimed to (1) discover self-identified phenotypes in 16p13.11 microduplication syndrome that have been underrepresented in the scientific literature and (2) demonstrate that self-phenotyping tools are valuable sources of data for the medical and scientific communities. METHODS: As part of a large study to compare and evaluate patient self-phenotyping surveys, an online survey tool, Phenotypr, was developed for patients with rare disorders to self-report phenotypes. Participants with 16p13.11 microduplication syndrome were recruited through the Boston Children's Hospital 16p13.11 Registry. Either the caregiver, parent, or legal guardian of an affected child or the affected person (if aged 18 years or above) completed the survey. Results were securely transferred to a Research Electronic Data Capture database and aggregated for analysis. RESULTS: A total of 19 participants enrolled in the study. Notably, among the 19 participants, aggression and anxiety were mentioned by 3 (16%) and 4 (21%) participants, respectively, which is an increase over the numbers in previously published literature. Additionally, among the 19 participants, 3 (16%) had asthma and 2 (11%) had other immunological disorders, neither of which has been previously described in the syndrome. CONCLUSIONS: Several phenotypes might be underrepresented in the previous 16p13.11 microduplication literature, and new possible phenotypes have been identified. Whenever possible, patients should continue to be referenced as a source of complete phenotyping data on their condition. Self-phenotyping may lead to a better understanding of the prevalence of phenotypes in genetic disorders and may identify previously unreported phenotypes.


Subjects
DNA Copy Number Variations; Family; Biological Variation, Population; Cohort Studies; Humans; Phenotype
7.
Patterns (N Y) ; 2(1): 100155, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33196056

ABSTRACT

Integrated, up-to-date data about SARS-CoV-2 and COVID-19 is crucial for the ongoing response to the COVID-19 pandemic by the biomedical research community. While rich biological knowledge exists for SARS-CoV-2 and related viruses (SARS-CoV, MERS-CoV), integrating this knowledge is difficult and time-consuming, since much of it is in siloed databases or in textual format. Furthermore, the data required by the research community vary drastically for different tasks; the optimal data for a machine learning task, for example, is much different from the data used to populate a browsable user interface for clinicians. To address these challenges, we created KG-COVID-19, a flexible framework that ingests and integrates heterogeneous biomedical data to produce knowledge graphs (KGs), and applied it to create a KG for COVID-19 response. This KG framework also can be applied to other problems in which siloed biomedical data must be quickly integrated for different research applications, including future pandemics.

8.
Nucleic Acids Res ; 48(D1): D704-D715, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31701156

ABSTRACT

In biology and biomedicine, relating phenotypic outcomes with genetic variation and environmental factors remains a challenge: patient phenotypes may not match known diseases, candidate variants may be in genes that haven't been characterized, research organisms may not recapitulate human or veterinary diseases, environmental factors affecting disease outcomes are unknown or undocumented, and many resources must be queried to find potentially significant phenotypic associations. The Monarch Initiative (https://monarchinitiative.org) integrates information on genes, variants, genotypes, phenotypes and diseases in a variety of species, and allows powerful ontology-based search. We develop many widely adopted ontologies that together enable sophisticated computational analysis, mechanistic discovery and diagnostics of Mendelian diseases. Our algorithms and tools are widely used to identify animal models of human disease through phenotypic similarity, for differential diagnostics and to facilitate translational research. Launched in 2015, Monarch has grown with regard to data (new organisms, more sources, better modeling); new API and standards; ontologies (new Mondo unified disease ontology, improvements to ontologies such as HPO and uPheno); user interface (a redesigned website); and community development. Monarch data, algorithms and tools are being used and extended by resources such as GA4GH and NCATS Translator, among others, to aid mechanistic discovery and diagnostics.


Subjects
Computational Biology/methods; Genotype; Phenotype; Algorithms; Animals; Biological Ontologies; Databases, Genetic; Exome; Genetic Association Studies; Genetic Variation; Genomics; Humans; Internet; Software; Translational Research, Biomedical; User-Computer Interface
9.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-31735951

ABSTRACT

While abnormalities related to carbohydrates (glycans) are frequent for patients with rare and undiagnosed diseases as well as in many common diseases, these glycan-related phenotypes (glycophenotypes) are not well represented in knowledge bases (KBs). If glycan-related diseases were more robustly represented and curated with glycophenotypes, these could be used for molecular phenotyping to help to realize the goals of precision medicine. Diagnosis of rare diseases by computational cross-species comparison of genotype-phenotype data has been facilitated by leveraging ontological representations of clinical phenotypes, using Human Phenotype Ontology (HPO), and model organism ontologies such as Mammalian Phenotype Ontology (MP) in the context of the Monarch Initiative. In this article, we discuss the importance and complexity of glycobiology and review the structure of glycan-related content from existing KBs and biological ontologies. We show how semantically structuring knowledge about the annotation of glycophenotypes could enhance disease diagnosis, and propose a solution to integrate glycophenotypes and related diseases into the Unified Phenotype Ontology (uPheno), HPO, Monarch and other KBs. We encourage the community to practice good identifier hygiene for glycans in support of semantic analysis, and clinicians to add glycomics to their diagnostic analyses of rare diseases.


Subjects
Disease; Glycomics; Semantics; Animals; Humans; Knowledge Bases; Phenotype; Polysaccharides/metabolism
11.
Nucleic Acids Res ; 45(D1): D712-D722, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899636

ABSTRACT

The correlation of phenotypic outcomes with genetic variation and environmental factors is a core pursuit in biology and biomedicine. Numerous challenges impede our progress: patient phenotypes may not match known diseases, candidate variants may be in genes that have not been characterized, model organisms may not recapitulate human or veterinary diseases, filling evolutionary gaps is difficult, and many resources must be queried to find potentially significant genotype-phenotype associations. Non-human organisms have proven instrumental in revealing biological mechanisms. Advanced informatics tools can identify phenotypically relevant disease models in research and diagnostic contexts. Large-scale integration of model organism and clinical research data can provide a breadth of knowledge not available from individual sources and can provide contextualization of data back to these sources. The Monarch Initiative (monarchinitiative.org) is a collaborative, open science effort that aims to semantically integrate genotype-phenotype data from many species and sources in order to support precision medicine, disease modeling, and mechanistic exploration. Our integrated knowledge graph, analytic tools, and web services enable diverse users to explore relationships between phenotypes and genotypes across species.


Subjects
Databases, Genetic; Genetic Association Studies/methods; Genotype; Phenotype; Animals; Biological Evolution; Computational Biology/methods; Data Curation; Humans; Search Engine; Software; Species Specificity; User-Computer Interface; Web Browser
12.
Genetics ; 203(4): 1491-5, 2016 08.
Article in English | MEDLINE | ID: mdl-27516611

ABSTRACT

The principles of genetics apply across the entire tree of life. At the cellular level we share biological mechanisms with species from which we diverged millions, even billions of years ago. We can exploit this common ancestry to learn about health and disease, by analyzing DNA and protein sequences, but also through the observable outcomes of genetic differences, i.e. phenotypes. To solve challenging disease problems we need to unify the heterogeneous data that relates genomics to disease traits. Without a big-picture view of phenotypic data, many questions in genetics are difficult or impossible to answer. The Monarch Initiative (https://monarchinitiative.org) provides tools for genotype-phenotype analysis, genomic diagnostics, and precision medicine across broad areas of disease.


Subjects
Computational Biology; Genetic Association Studies; Genomics; Precision Medicine; Databases, Genetic; Humans; Sequence Analysis, DNA; Sequence Analysis, Protein
13.
Hum Mutat ; 36(10): 979-84, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26269093

ABSTRACT

The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. These public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases.
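
Matching a patient profile against disease or model organism profiles by phenotype semantic similarity can be sketched in a simplified form. The abstract does not say which measure the Monarch algorithms use, so the example substitutes a plain Jaccard index over ontology ancestor sets; the tiny ontology, the `P:` identifiers, and the candidate names are all hypothetical.

```python
# Toy phenotype ontology: term -> list of parent terms.
PARENTS = {
    "P:root": [],
    "P:neuro": ["P:root"],
    "P:seizure": ["P:neuro"],
    "P:ataxia": ["P:neuro"],
    "P:cardiac": ["P:root"],
    "P:arrhythmia": ["P:cardiac"],
}


def closure(term):
    """Return the term together with all of its ancestors."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen


def profile_closure(terms):
    """Union of the ancestor closures of every term in a profile."""
    out = set()
    for t in terms:
        out |= closure(t)
    return out


def jaccard(profile_a, profile_b):
    """Shared ancestors divided by all ancestors of the two profiles."""
    a, b = profile_closure(profile_a), profile_closure(profile_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(patient, candidates):
    """Order candidate names from most to least similar to the patient."""
    return sorted(candidates, key=lambda name: jaccard(patient, candidates[name]), reverse=True)


patient = ["P:seizure"]
candidates = {"model_a": ["P:ataxia"], "model_b": ["P:arrhythmia"]}
ranking = rank(patient, candidates)
```

Comparing ancestor sets instead of the raw terms is what lets two profiles with no term in common still score as related: here the patient and `model_a` share no phenotype directly, yet both fall under the same neurological branch, so `model_a` ranks above `model_b`.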


Subjects
Databases, Genetic; Disease/genetics; Genetic Predisposition to Disease/genetics; Animals; Disease Models, Animal; Genetic Variation; Humans; Information Dissemination; Phenotype; User-Computer Interface
14.
Pathog Dis ; 68(2): 39-43, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23661595

ABSTRACT

Helicobacter pylori, an inhabitant of the gastric mucosa of over half of the world population, with decreasing prevalence in the U.S., has been associated with a variety of gastric pathologies. However, the majority of H. pylori-infected individuals remain asymptomatic, and negative correlations between H. pylori and allergic diseases have been reported. Comprehensive genome characterization of H. pylori populations from different human host backgrounds including healthy individuals provides the exciting potential to generate new insights into the open question of whether human health outcome is associated with specific H. pylori genotypes or dependent on other environmental factors. We report the genome sequences of 65 H. pylori isolates from individuals with gastric cancer, preneoplastic lesions, peptic ulcer disease, gastritis, and from asymptomatic adults. Isolates were collected from multiple locations in North America (USA and Canada) as well as from Colombia and Japan. The availability of these H. pylori genome sequences from individuals with distinct clinical presentations provides the research community with a resource for detailed investigations into genetic elements that correlate either positively or negatively with the epidemiology, human host adaptation, and gastric pathogenesis and will aid in the characterization of strains that may favor the development of specific pathology, including gastric cancer.


Subjects
DNA, Bacterial/chemistry; DNA, Bacterial/genetics; Genome, Bacterial; Helicobacter Infections/microbiology; Helicobacter pylori/genetics; Sequence Analysis, DNA; Adult; Asymptomatic Diseases; Cluster Analysis; Colombia; Gastritis/microbiology; Helicobacter Infections/pathology; Helicobacter pylori/isolation & purification; Helicobacter pylori/pathogenicity; Humans; Japan; Molecular Sequence Data; North America; Peptic Ulcer/microbiology; Phylogeny; Stomach Neoplasms/microbiology
15.
J Bacteriol ; 194(19): 5450, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22965080

ABSTRACT

Mycobacterium massiliense (Mycobacterium abscessus group) is an emerging pathogen causing pulmonary disease and skin and soft tissue infections. We report the genome sequence of the type strain CCUG 48898.


Subjects
Mycobacterium Infections/microbiology; Mycobacterium/classification; Mycobacterium/genetics; Genome, Bacterial; Humans; Molecular Sequence Data; Phylogeny; Polymorphism, Single Nucleotide
16.
J Bacteriol ; 194(19): 5469, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22965092

ABSTRACT

Enterobacter radicincitans sp. nov. DSM16656(T) represents a new species of the genus Enterobacter; it is a biological nitrogen-fixing endophytic bacterium with growth-promoting effects on a variety of crop and model plant species. The presence of genes for nitrogen fixation, phosphorus mobilization, and phytohormone production reflects this microbe's potential plant growth-promoting activity.


Subjects
Enterobacter/classification; Enterobacter/genetics; Genome, Bacterial; Molecular Sequence Data; Plant Roots/growth & development; Plant Roots/microbiology; Plant Shoots/growth & development; Plant Shoots/microbiology; Soil Microbiology; Triticum/growth & development